Conformance Tests for SEP-2322 MRTR #188

Merged
CaitieM20 merged 40 commits into modelcontextprotocol:main from CaitieM20:mrtr-tests
May 22, 2026

Conversation

@CaitieM20
Contributor

Draft Conformance tests for the SEP-2322: Multi Round-Trip Requests

Also added code to client-helper.ts for making raw MCP requests (i.e. basic JSON requests). This will be generally useful for draft features that may not have reference implementations yet.
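
As an illustration of the raw-request idea, here is a minimal sketch that assumes only MCP's JSON-RPC 2.0 framing. The names `RawRequest` and `buildRawRequest` are hypothetical and are not taken from client-helper.ts, and the tool name is a placeholder.

```typescript
// Hypothetical helper (not the actual client-helper.ts code): builds a bare
// JSON-RPC 2.0 request without typed SDK methods, so draft fields can be
// sent exactly as written.
interface RawRequest {
  jsonrpc: '2.0';
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

function buildRawRequest(
  id: number,
  method: string,
  params?: Record<string, unknown>
): RawRequest {
  // Omit `params` entirely when none are given instead of sending undefined.
  return params === undefined
    ? { jsonrpc: '2.0', id, method }
    : { jsonrpc: '2.0', id, method, params };
}

// Example: a tools/call request; 'example_tool' is a placeholder name.
const rawToolCall = buildRawRequest(1, 'tools/call', {
  name: 'example_tool',
  arguments: {}
});

// The serialized body is what would go over the transport.
const rawBody: string = JSON.stringify(rawToolCall);
```

Keeping the builder separate from the transport means the same request object can be posted over whichever transport the harness uses.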

Motivation and Context

See SEP

How Has This Been Tested?

Conformance tests and reference implementation work are in progress.

Breaking Changes

Yes, see SEP.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally - existing tests pass; draft tests do not, since we don't have an implementation yet
  • I have added appropriate error handling
  • I have added or updated documentation as needed

Additional context

@pkg-pr-new

pkg-pr-new Bot commented Mar 17, 2026

npx https://pkg.pr.new/@modelcontextprotocol/conformance@188

commit: 85fe6b7

@panyam

panyam commented May 6, 2026

Hello - saw this PR while looking at the 2322 finalizing threads. I've been porting our local MRTR + Tasks Extension scenarios into a fork of the official suite at panyam/mcpconformance:feat/tasks-mrtr-extension - looks like our ephemeral-flow scenarios cover similar ground to your A1-A7 set, and we've also built out the wider Tasks Extension perimeter (lifecycle, capability negotiation, dispatch, request-state, headers, notifications) which this PR doesn't span.

The bridge scenario (Tasks + MRTR partial fulfillment) is narrow on our side - 3 checks - vs your incomplete-result-tasks.ts which goes deeper. Looks like the two are mostly complementary.

If you're planning to revive this PR after the SEP finalizes, happy to help refresh wire format and pair on the bridge surface. Otherwise I can open a separate PR for the wider Tasks Extension scope and defer the ephemeral-flow / bridge depth to whatever lands here. Or some merged form, whatever's easiest for you. Just wanted to make sure I wasn't undoing anything 🙏

Comment thread src/scenarios/server/lifecycle.test.ts Outdated
CaitieM20 added 2 commits May 20, 2026 14:21
* add missing tests and update sep-2322.yaml
@CaitieM20 CaitieM20 requested a review from pcarleton May 21, 2026 02:46
@CaitieM20
Contributor Author

Hey @panyam would love some help here. I've updated this PR now that 2322 is Approved and checked in. I removed the tasks tests and marked the checks as excluded since we have moved Tasks to an extension.

I saw you opened one for the Task extension which I think is the right path forward for the Task Conformance tests. Appreciate the help.

@CaitieM20 CaitieM20 enabled auto-merge (squash) May 21, 2026 02:49
@panyam

panyam commented May 21, 2026

@CaitieM20 Thanks - and yep, totally agree with the split. PR 262 (the Task extension side) is now in a position to be merged. Luca has signed off and asked pcarleton to give it a final look. We're shipping it with two cross-SEP suites deliberately skipped (mrtr-tasks-composition and tasks-status-notifications via subscriptions/listen). Plans are there for a fast-follow for both. These both overlap naturally with the MRTR side here - the composition test in particular needs to encode the asymmetric requestState invariant (MRTR phase carries requestState, Task phase forbids it), which only lands cleanly once both PRs are in.
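
The asymmetric requestState invariant mentioned above could be encoded roughly as follows. This is only a sketch under stated assumptions: the `Phase` labels, the payload shape, and the function name are hypothetical and do not come from either PR.

```typescript
// Hypothetical phase labels for a composition test; not names from the suite.
type Phase = 'mrtr' | 'task';

// Returns true when the payload respects the invariant described in the
// comment above: the MRTR phase carries requestState, the Task phase must not.
function checkRequestStateInvariant(
  phase: Phase,
  payload: Record<string, unknown>
): boolean {
  if (phase === 'mrtr') {
    // requestState is an opaque string, so only its presence and type are checked.
    return typeof payload['requestState'] === 'string';
  }
  // Task phase: the key must be absent altogether.
  return !('requestState' in payload);
}
```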

So when this one merges I can bring on the composition harness against whatever fixture shape you settle on. Will review the refreshed diff here in the meantime and surface anything from the Task-side experience that's worth folding in - but overall thanks for all the updates. Looking great and can't wait for it!

@CaitieM20 CaitieM20 disabled auto-merge May 21, 2026 04:09
CaitieM20 added 3 commits May 20, 2026 21:42
* remove duplicates in index, rename test cases to be consistent

* add negative tests

* refactor into everything-server and update negative tests

* fix conformance tests
@CaitieM20 CaitieM20 enabled auto-merge (squash) May 21, 2026 15:28
@CaitieM20 CaitieM20 requested a review from felixweinberger May 21, 2026 18:08
CaitieM20 and others added 4 commits May 21, 2026 16:42
…quirement

The traceability schema recognizes only check/text/url/issue/excluded on a
requirement row. The 'note:' field on the scenario-gate rows was silently
dropped, so those 11 rows would have been ingested as ordinary requirement
rows whose text is not a spec sentence, inflating the SEP-2322 requirement
count on the traceability dashboard.
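
To illustrate how an unrecognized field such as 'note:' could be caught rather than silently dropped, here is a minimal sketch. The five recognized keys come from the message above; the row shape, the function name, and the check ID are hypothetical and are not the actual traceability tooling.

```typescript
// The five keys the traceability schema recognizes on a requirement row,
// as listed in the commit message.
const RECOGNIZED_ROW_KEYS = ['check', 'text', 'url', 'issue', 'excluded'] as const;

// Hypothetical validator: returns every key on a row that the schema would ignore.
function unrecognizedRowKeys(row: Record<string, unknown>): string[] {
  const known: readonly string[] = RECOGNIZED_ROW_KEYS;
  return Object.keys(row).filter((key) => !known.includes(key));
}

// A gate-style row carrying a 'note' field; the check ID is a placeholder.
const gateRow = {
  check: 'sep-2322-example-gate',
  note: 'scenario gate, not a spec sentence'
};

// Contains 'note', the only key outside the recognized set.
const strayKeys = unrecognizedRowKeys(gateRow);
```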

- Remove the 10 flow-gate rows (sep-2322-*-complete, sep-2322-multi-round-r*,
  sep-2322-non-tool-*) plus sep-2322-multiple-inputs-incomplete. The checks
  are unchanged and still emitted by the scenarios; their IDs now surface in
  the manifest's 'untracked' list, which is the designed home for scenario
  scaffolding that doesn't map to an RFC-2119 sentence.
- Move 'inputRequests keys ... MUST be unique' to the excluded list:
  duplicate JSON object keys are collapsed by the parser before the harness
  can observe them, so the requirement is not testable at the protocol
  level. The check previously paired with it actually verifies that the
  server returns three inputRequests of different method types, which is a
  flow gate, not a key-uniqueness test.
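
The untestability of key uniqueness can be seen with a small sketch. The payload is a made-up fragment, not a real fixture, and the method values are placeholders.

```typescript
// Wire text with a duplicated key "a" inside inputRequests.
const wireText =
  '{"inputRequests":{"a":{"method":"first"},"a":{"method":"second"}}}';

// JSON.parse keeps only the last value for a repeated key, so the duplicate
// is gone before any check can look at the parsed object.
const parsedPayload = JSON.parse(wireText) as {
  inputRequests: Record<string, { method: string }>;
};

// Only one key is observable after parsing.
const observedKeys = Object.keys(parsedPayload.inputRequests);
```

Detecting the duplicate would require inspecting the raw bytes with a custom tokenizer, which is outside what a protocol-level harness normally sees.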

The spec says the client MUST echo back the exact value of requestState and
MUST NOT inspect, parse, or modify it. The check previously parsed the
returned state as JSON and compared two fields, so a client that
deserialized the state and re-serialized it (different key order,
whitespace, extra fields) would still pass despite having modified the
opaque value.

Store the exact string the mock server sent and compare the echoed value
with strict string equality instead. Also include the sent value in the
check details so a mismatch is diagnosable from the report.
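
A minimal sketch of the strict comparison described above. The `EchoCheck` shape, the status labels, and the function name are hypothetical, not the suite's actual check type, and the state string is a made-up example.

```typescript
// Hypothetical check result shape; the real suite's type may differ.
interface EchoCheck {
  status: 'SUCCESS' | 'FAILURE';
  details: { sent: string; received: unknown };
}

// Compare the echoed requestState with the exact string that was sent.
// No parsing: the value is opaque, so any byte-level change is a failure.
function checkRequestStateEcho(sent: string, received: unknown): EchoCheck {
  return {
    status: received === sent ? 'SUCCESS' : 'FAILURE',
    // Both values go into the details so a mismatch can be diagnosed from the report.
    details: { sent, received }
  };
}

const sentState = '{"step":1,"nonce":"abc"}';

// A client that parsed and re-serialized the state: same fields, different key order.
const reserializedState = JSON.stringify({ nonce: 'abc', step: 1 });

const exactEcho = checkRequestStateEcho(sentState, sentState);
const modifiedEcho = checkRequestStateEcho(sentState, reserializedState);
```

A field-by-field comparison would accept `reserializedState`, while strict string equality rejects it.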
Member

@pcarleton pcarleton left a comment

LGTM!

Left 2 small tweaks in follow-up commits. I'll optimistically merge, but let me know if you disagree with either.

@CaitieM20 CaitieM20 merged commit 43fbf60 into modelcontextprotocol:main May 22, 2026
4 checks passed
pcarleton added a commit that referenced this pull request May 22, 2026
* chore: refresh SEP traceability manifest (typescript-sdk@main)

Regenerated from a client+server suite run against typescript-sdk@5fc42e9be115
following the recipe in .github/workflows/traceability.yml.

New entries since the last refresh (typescript-sdk@22595b96):
- SEP-2322 (MRTR, #188): 17 tested, 0 untested, 16 excluded, 3 untracked
- SEP-2549 (TTL for list results, #275): 7 tested, 0 untested, 13 excluded
- SEP-2260: 12 excluded rows, no checks
- SEP-2207: yaml rows added since the last refresh now appear
  (1 tested, 1 untested: sep-2207-server-no-offline-access)

No previously-tested requirement regressed.

* Exclude sep-2207 server offline_access guidance until RS auth scenarios exist

sep-2207-server-no-offline-access was declared in the yaml but no scenario
emits it, so it surfaced as the only untested requirement in the refreshed
manifest. The check needs to probe the SDK server's Protected Resource
Metadata scopes_supported and WWW-Authenticate challenge scope, and the
server suite does not yet exercise the SDK server as an OAuth protected
resource at all.
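
A rough sketch of what such a probe might compare, assuming the requirement is that neither place advertises `offline_access`, as the check ID suggests. The function name and inputs are hypothetical, and the regex handles only a simple quoted `scope` parameter.

```typescript
// Hypothetical probe: true if offline_access appears either in the Protected
// Resource Metadata scopes_supported list or in the scope parameter of a
// WWW-Authenticate challenge.
function advertisesOfflineAccess(
  scopesSupported: readonly string[],
  wwwAuthenticate: string
): boolean {
  // Pull out scope="..." if present; scopes are space-delimited.
  const match = /\bscope="([^"]*)"/.exec(wwwAuthenticate);
  const challengeScopes = (match?.[1] ?? '')
    .split(' ')
    .filter((scope) => scope.length > 0);
  return [...scopesSupported, ...challengeScopes].includes('offline_access');
}
```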

Mark the requirement excluded with a pointer to #116 (server-side
authorization baseline) rather than leaving it as a permanently-untested
row; revisit when server-side authorization scenarios land.

Labels

enhancement New feature or request

5 participants